Text-based unstressed syllable prediction in Mandarin

Authors

  • Ya Li
  • Jianhua Tao
  • Meng Zhang
  • Shifeng Pan
  • Xiaoying Xu
Abstract

Recently, increasing attention has been paid to Mandarin word stress, which is important for improving the naturalness of speech synthesis. Most research on Mandarin speech synthesis considers three stress levels: stressed, regular, and unstressed. This paper focuses on unstressed syllable prediction, because unstressed syllables are also important to the intelligibility of synthetic speech. As with prosodic structure, stress is not easy to detect from text analysis because of the complicated context information. A method based on the Classification and Regression Tree (CART) model is proposed to predict unstressed syllables, reaching an accuracy of 85%. The method was then applied in a TTS system. The experiment shows that the MOS score of the synthetic speech improved by 0.35, and the pitch contour of the newly synthesized speech is also closer to that of natural speech.
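
The abstract does not list the features or the tree configuration that were used, so the following is only a minimal sketch of how a CART-style classifier (binary splits chosen by Gini impurity) could label syllables from text-derived context. The feature set (tone, position in word, part of speech), the label names, and the toy rows are all hypothetical illustrations, not data or settings from the paper.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Find the (feature_index, value) equality test with the lowest weighted Gini.

    Returns None when no test improves on the impurity of the unsplit node.
    """
    best = None
    best_score = gini(labels)
    for f in range(len(rows[0])):
        for v in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] == v]
            right = [y for r, y in zip(rows, labels) if r[f] != v]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best_score, best = score, (f, v)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a binary tree recursively; leaves hold the majority class."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, v = split
    yes = [(r, y) for r, y in zip(rows, labels) if r[f] == v]
    no = [(r, y) for r, y in zip(rows, labels) if r[f] != v]
    return {
        "feature": f,
        "value": v,
        "yes": build_tree([r for r, _ in yes], [y for _, y in yes], depth + 1, max_depth),
        "no": build_tree([r for r, _ in no], [y for _, y in no], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk the tree from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["yes"] if row[tree["feature"]] == tree["value"] else tree["no"]
    return tree

# Hypothetical toy data: (tone, position in word, part of speech) per syllable.
rows = [
    ("neutral", "final", "particle"),
    ("neutral", "final", "noun"),
    ("tone4", "initial", "noun"),
    ("tone1", "initial", "verb"),
    ("tone3", "final", "verb"),
    ("neutral", "final", "particle"),
    ("tone2", "initial", "noun"),
]
labels = ["unstressed", "unstressed", "other", "other", "other", "unstressed", "other"]

tree = build_tree(rows, labels)
print(predict(tree, ("neutral", "final", "noun")))
```

On this toy set a single split on the hypothetical tone feature separates the classes perfectly. Real text would need richer context features and held-out evaluation before any accuracy figure could be compared with the one reported in the abstract.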

Similar resources

Mandarin Chinese Tonal Issues from the Perspective of Speech Synthesis

This paper presents two tonal issues in spoken Mandarin Chinese from the perspective of speech synthesis. One is a unique Chinese phonetic category Qingsheng (轻声). Based on speech synthesis and natural speech analysis, two acoustic criteria were suggested for distinguishing Qingsheng from the unstressed syllables which occur frequently in natural speech. The other is a tone sandhi phenomenon whi...

Acoustic characteristics of English lexical stress produced by native Mandarin speakers.

Native speakers of Mandarin Chinese have difficulty producing native-like English stress contrasts. Acoustically, English lexical stress is multidimensional, involving manipulation of fundamental frequency (F0), duration, intensity and vowel quality. Errors in any or all of these correlates could interfere with perception of the stress contrast, but it is unknown which correlates are most probl...

Syllable HMM based Mandarin TTS and comparison with concatenative TTS

This paper introduces a Syllable HMM based Mandarin TTS system. 10-state left-to-right HMMs are used to model each syllable. We leverage the corpus and the front end of a concatenative TTS system to build the Syllable HMM based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllable to improve the voiced/unvoiced decision of HMM states. Evaluation results s...

Features of contracted syllables of spontaneous Mandarin

Mandarin is a syllable-timed language whose syllable structure is quite simple [1]. In spontaneous Mandarin, because of rapid speech rate the structure of syllable may be changed, phonemes may be reduced and syllable boundaries as well as lexical tones may be merged. This fact has long been noticed, but no quantified empirical data were actually presented in the literature until now. This paper...

Duration Prediction in Mandarin TTS System

This paper reports the methodology and results of decision tree based duration prediction for a Mandarin text-to-speech system developed by the Fujitsu Laboratories. Syllable initials and finals are the basic units in this duration study. Factors influencing finals duration such as phrase boundary and phone context are discussed in detail. Experiments indicate that it is the most important dete...

Publication date: 2010